How adequate an account of experimental findings in word
recognition do direct access and search models provide?

 

Greg Detre

Friday, November 03, 2000

L&C V – Lexical Processing

 

Lexical processing refers to recognising a word and its meaning, in both spoken and written form. This means identifying the words in whatever form they are presented and retrieving their associated syntactic and semantic information. It requires segmenting the auditory or visual input into words, on the basis of pauses/gaps and other features of the signal. Most experimental work is conducted on written rather than spoken input, because the printed word allows us to control for speaker variation, pronunciation, dialect, noise etc. This is all perceptual information. Information like the topic of discussion, accompanying images (e.g. in an advertisement), cultural norms etc. is contextual information. Lexical processing is one of the early steps in language: in comprehension it involves lexical/word recognition, and in production it involves lexical selection. Lexical access is differentiated from word recognition in some models as throwing up a number of candidates from the mental lexicon, from which the final one is identified, i.e. selected. The mental lexicon is like a dictionary: it stores information about the pronunciation, spelling and part of speech of a word, and points to its meaning.

Experiments in word recognition take two main forms: lexical decision and pronunciation. In lexical decision, subjects are presented with a string of letters and have to decide whether it forms a legitimate word of English. This task demands only that the subject perform a search through the mental lexicon, without any further processing. However, Henderson (1982) has suggested that some of the main experimental findings in lexical decision tasks may be explicable in terms of processing that occurs after this search through the mental lexicon.

In pronunciation tasks, subjects are asked to say words or non-words out loud, with the pause before speaking being measured. However, there may be a non-lexical route to pronouncing the words, using pronunciation rules or phonological analogy with other similar-looking words (though experiments on phonological recoding address this to an extent). In that case, it may not be necessary for subjects to access the pronunciation data stored for that word in the mental lexicon; lexical decision tasks, by contrast, force subjects to refer to the mental lexicon in order to identify whether or not the string is a real word.

There are various experimental findings which any model of lexical processing needs to account for:

frequency effect – more common words (measurable, for instance, in terms of the number of occurrences in a linguistic corpus) are responded to faster in lexical decision and pronunciation tasks

word/non-word effect – non-words that don't conform to the rules of English orthography/phonotactics are rejected very fast. Also, the closer a non-word is to a real word, the harder it is to reject

context effect – if a preceding word is related to the target, lexical decision is faster than if the context is neutral, e.g. 'butter' is recognised faster after 'bread' than after a neutral 'xxxxx'. It is not clear, though, exactly what types of cue context offers.

degradation or stimulus quality effect – ink blots and anything else that affects the perceptual information adversely obviously makes lexical processing harder and slower

word-superiority effect – it seems that we are better at recognising a letter as part of a word than on its own, implying that we don't just look at the letters and then see which word they seem to form, but that the emerging list of candidate words affects our guesses about letters

Morton's (1969, 1979) direct access logogen model was originally designed to explain recognition of very briefly presented words, rather than timed lexical decision, but has since been extended with a reasonable degree of success. Visual or auditory input is fed into a set of feature-detectors: visual feature-detectors look for lines and curves in various places, while auditory feature-detectors look for the salient acoustic patterns that distinguish phonemes. Every time a feature is detected that a given word contains, the 'logogen' for that word increments its counter. A word is considered 'recognised' when its feature count exceeds the logogen's threshold.
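The core mechanism can be sketched in a few lines of code. This is only an illustrative toy, not Morton's actual formulation: the feature sets, thresholds and the tie-breaking rule are my own assumptions, and letters stand in for perceptual features.

```python
# A minimal sketch of the logogen idea: parallel counters with thresholds.
# Feature sets, thresholds and the context bonus are illustrative assumptions.

class Logogen:
    def __init__(self, word, features, threshold):
        self.word = word
        self.features = set(features)   # perceptual features this word contains
        self.threshold = threshold      # lowered each time the logogen fires
        self.count = 0                  # activation from detected features

    def register(self, feature):
        # every detected feature that the word contains bumps the counter
        if feature in self.features:
            self.count += 1

    def fired(self):
        return self.count >= self.threshold


def recognise(input_features, logogens, context=()):
    """Feed detected features to every logogen in parallel; a logogen whose
    count reaches its threshold 'recognises' its word."""
    for lg in logogens:
        lg.count = 1 if lg.word in context else 0  # context effect: head start
    for f in input_features:
        for lg in logogens:                        # all logogens monitor in parallel
            lg.register(f)
    winners = [lg for lg in logogens if lg.fired()]
    if not winners:
        return None
    best = max(winners, key=lambda lg: lg.count - lg.threshold)
    best.threshold = max(1, best.threshold - 1)    # frequency effect: firing lowers threshold
    return best.word
```

Note how the frequency and context effects fall out of two different mechanisms here (a lowered threshold versus a pre-loaded count), which is exactly the distinction that becomes problematic below.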

This direct access model can explain most of the experimental findings to an encouraging degree, but it does fall over in a few places. The frequency effect is easily explained by thresholds that are lowered every time a logogen fires. The context effect can be understood if we allow contextual as well as perceptual information to increase the logogen count: a contextual feature-detector of some description will increase the 'bread' logogen if 'butter' has been detected nearby. Similarly, the stimulus quality effect can be seen either as a preprocessing perceptual clean-up that takes longer (Mitchell, 1982), or as part of the feature detection; either way we would expect stimulus quality to affect word recognition, but to affect most models in the same way, because they all rely on some degree of perceptual processing. The word-superiority effect is explicable if we remember that the feature detectors are looking for features, rather than letters specifically. Thus the logogen for a word might fire faster and more reliably than the logogen for a letter, because the aggregate features of a word identify it more certainly than the more limited feature set of a single letter.

However, this basic logogen model does have certain problems accounting for some experimental findings. Because it explains the context and frequency effects in a similar way (both are additive: one lowers the threshold, the other adds a contextual bonus on top of the perceptual feature-detector input), we would expect them to show similar patterns of interaction with stimulus quality. In fact, when Becker and Killion (1977) tested this prediction, they found that although context and degradation showed the kind of multiplicative effect expected, the effects of frequency and degradation were simply additive. Norris (1984) has since found that frequency and degradation can interact, but that the interaction between context and frequency is larger and more robust than the logogen model can explain.

Secondly, this model cannot account for the experimental findings with non-words, where we can easily reject non-words that are highly unlike common words, primed words or any words in the English language. Coltheart et al. (1977) posited a deadline of some sort: if no logogen fires in time, the system assumes that it is being presented with a non-word. But some non-words are confusingly like real words, and take much longer to discriminate, so Coltheart suggests that unless the feature detectors are registering features at a certain rate, the deadline expires sooner. Forster prefers an account which involves multiple logogens. This is perhaps the most satisfactory account, because there is some scope for incorporating information about patterns that extend across multiple letters, as our lexical processing system surely must in order to identify words that are orthographically legitimate.

 

The major alternative to the direct access model is best characterised by Forster's (1979) search model. Importantly, this model does not require us to search through our entire mental lexicon, but only through one of three access files. There is one access file for written input (visual feature detectors), one for spoken input (auditory feature detectors) and one for producing words (containing the syntactic/semantic properties necessary to decide which words could come next). The contents of these access files are ordered by frequency, so that the words at the top of the access files are searched first, with this order presumably changing according to which words have been accessed most recently. This account is intended to be more parsimonious than a search through the entire mental lexicon: the access files are further divided into bins, which might be organised by letter or sound, and only for non-words does the system have to search through all the bins. When a word has been recognised in the access file, a cross-reference is made to the master file, i.e. the mental lexicon, where the information is duplicated and expanded, including a sort of thesaurus of related words.
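The contrast with the logogen model's parallel counters is a serial search, and it can likewise be sketched in code. Again this is an illustrative toy under my own assumptions: bins are keyed by first letter, frequencies and master-file entries are invented, and only the bin for the input's first letter is searched (a non-word is detected by exhausting it).

```python
# A minimal sketch of a Forster-style search model: frequency-ordered bins
# plus a master file. Bin organisation and entries are illustrative assumptions.

from collections import defaultdict

class SearchModel:
    def __init__(self, lexicon):
        # master file: the full mental lexicon, keyed by word
        self.master = dict(lexicon)          # word -> (frequency, meaning pointer)
        # access file: bins keyed by an orthographic cue (here, first letter)
        self.bins = defaultdict(list)
        for word, (freq, _) in lexicon.items():
            self.bins[word[0]].append((freq, word))
        for b in self.bins.values():
            b.sort(reverse=True)             # frequency effect: frequent words first

    def decide(self, letters):
        """Lexical decision: serial search through one bin. Returns the
        master-file entry and the number of comparisons made; a non-word
        returns None only after the bin is exhausted."""
        steps = 0
        for _, word in self.bins.get(letters[0], []):
            steps += 1
            if word == letters:
                return self.master[word], steps   # cross-reference to master file
        return None, steps
```

The step count makes the model's predictions explicit: frequent words are found in few comparisons, rare words in more, and non-words take longest of all because the whole bin must be searched before rejection.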

The frequency effect is easily enough explained, because the access files are ordered with the most frequently occurring words at the top. However, given the further sub-organisation into bins, there would need to be a parallel search across bins for the most frequent words to be consistently found quickest. Context effects are perhaps least well accounted for by the search model: unlike the direct access model, the search model needs to specify what sort of information is contained in the files and how it is represented, and the basic semantic pointers stored in the mental lexicon that Forster suggests would simply not be sufficient. Unless that list were extremely expansive, the priming effects that we observe for distantly related words would not occur. More importantly still, context effects extend beyond direct semantic similarities, yet such effects could never arise from this model. The duplication this model requires would defeat the purpose of the search model by its inefficiency of representation; we would expect context effects to be explained in terms of an ingenious representation, probably connectionist, rather than by positing an entirely new area of the lexicon.

 

Isn't all this talk of 'searching through the mental lexicon' a rather high-level way of seeing things? Although both the direct access and search models could be given connectionist readings, they make more sense if we imagine the search through the mental lexicon as a serial search through a list, like a computer file, rather than as the distributed representation of the brain. In the brain, the mental lexicon will be spread over a large mass of neurons, encoded as the weighted synaptic connections between them. A 'search' through such a massively parallel system will take very little time at all. Remembering this makes a supposedly more parsimonious system like the search model, which only requires us to search restricted areas of the lexicon rather than the entire lexicon, less attractive. In fact, there is a larger point to note here: a distributed representation is so unlike the high-level, symbolic representations that we use when we discuss storage systems and searches that we find it very hard to imagine how the mental lexicon might really work. And this greatly affects how we view the models that incorporate it.

Both models detailed here struggle especially with the complex context effects that we observe, particularly inhibitory context effects. This seems to me to result from the simplicity of the way they account for meaning. Knowing the meaning of a word can be abstractly defined as being able to use that word in the right place at the right time; given how complex the rules determining this are, it seems unlikely that this information could be so simply encoded in the mental lexicon, divorced from the other properties of the word. Given all of the above criticisms, neither the direct access nor the search model described here really does justice to the power of our word recognition ability.